Skip to content

chore(dx): align local typecheck workflow#884

Open
bokelley wants to merge 5 commits into
mainfrom
compare-ty-pyrefly-typing
Open

chore(dx): align local typecheck workflow#884
bokelley wants to merge 5 commits into
mainfrom
compare-ty-pyrefly-typing

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

@bokelley bokelley commented May 27, 2026

Summary

  • add a canonical make bootstrap setup path for uv-managed dev dependencies and hooks
  • add make typecheck-all so source mypy, adopter type fixtures, and the no-ignore contract run together
  • enforce that tests/type_checks/ fixtures do not use real # type: ignore suppressions
  • align README, CONTRIBUTING, CLAUDE, pre-commit, and CI around the same type-check workflow
  • pin mypy to 1.20.2 across pip and uv dev paths
  • enable mypy warn_unreachable and strict_equality_for_none, with small type-boundary cleanups needed to make those checks pass
  • refresh the reference seller/proposal lifecycle behavior needed by the current 3.1 storyboard runner

Why

The repo already had strong mypy settings, but local docs, Make targets, pre-commit, and CI were teaching slightly different workflows. This makes the contributor path and the adopter type-check contract explicit and executable.

The follow-up type cleanup turns the useful stricter mypy probe into an enforced project setting instead of a one-off advisory command.

The 3.1 storyboard runner now exercises fixed-price product paths, viewability totals, and correctable recovery for unknown proposal IDs. The reference examples and proposal guards now match that contract while keeping cross-tenant proposal lookups collapsed into PROPOSAL_NOT_FOUND.

Validation

  • make ci-local
  • ADCP_RUNNER_BIN=adcp ADCP_PORT=3102 STORYBOARD_RESULT_PATH=.context/storyboard-result-adcp-3.1-rerun.json SELLER_LOG_PATH=.context/reference-seller-adcp-3.1-rerun.log scripts/ci/run_storyboard_reference_seller.sh
  • ADCP_SDK_VERSION=adcp-3.0 ADCP_PORT=3103 STORYBOARD_RESULT_PATH=.context/storyboard-result-adcp-3.0-rerun.json SELLER_LOG_PATH=.context/reference-seller-adcp-3.0-rerun.log scripts/ci/run_storyboard_reference_seller.sh
  • adcp storyboard run http://127.0.0.1:3107/mcp media_buy_seller/proposal_finalize --json --allow-http
  • focused ruff and pytest checks while iterating on the stricter mypy cleanup
  • pre-commit hook suite on commit
  • uv run --extra dev pre-commit run adopter-type-checks --all-files
  • uv run --extra dev pre-commit run adopter-type-ignore-contract --all-files
  • uv run --extra dev pre-commit run check-yaml --all-files
  • uv run --extra dev black --check scripts/check_type_ignore_contract.py
  • git diff --check

@bokelley bokelley changed the title [codex] Align local typecheck workflow chore(dx): align local typecheck workflow May 27, 2026
@bokelley bokelley marked this pull request as ready for review May 27, 2026 10:00
aao-ipr-bot[bot]
aao-ipr-bot Bot previously approved these changes May 27, 2026
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Follow-ups noted below. The mypy warn_unreachable + strict_equality_for_none flip is the right shape — load-bearing dead branches caught at compile time beats defensive is None in every getter.

Things I checked

  • capabilities.py:215-226 — removed if self._caps.adcp is None: return False. Safe. adcp is required (non-Optional) on GetAdcpCapabilitiesResponse per types/generated_poc/protocol/get_adcp_capabilities_response.py:1363. warn_unreachable was right.
  • proposal_dispatch.py new maybe_validate_refine_proposal_refs (51 LoC). Skips action="finalize" entries (maybe_intercept_finalize handles those); uses expected_account_id=ctx.account.id so cross-tenant probes collapse to the same PROPOSAL_NOT_FOUND as missing-record. repr()-formatted proposal_id in the message bounds log-injection from a buyer-controlled id. Wired into handler.get_products before the has_refine_support check. Right shape.
  • PROPOSAL_NOT_FOUND recovery downgrade terminalcorrectable. Aligned at proposal_lifecycle.py:114, proposal_dispatch.py:208, and the new helper. Cross-tenant probe and missing-record both return identical None from the in-memory store and the PG store — no principal-enumeration discriminator introduced.
  • Server-side decisioning-error shim refactor (server/serve.py:2371, server/translate.py:117, server/a2a_server.py:50). DecisioningAdcpError = None replaced with _DECISIONING_ADCP_ERROR_TYPES: tuple[...]. isinstance(x, ()) returns False — preserved semantics. protocols/mcp.py:42 does the same for HTTPStatusError. Clean.
  • scripts/check_type_ignore_contract.py (46 LoC). Walks COMMENT tokens only, so docstring mentions of "type: ignore" don't trip it. comment.startswith("type: ignore") catches both bare and type: ignore[code] forms — intentional and matches the no-suppression contract.
  • Public Protocol widening: BuyerAgentRegistry.resolve_by_credential accepts Credential (now includes HttpSigCredential). Framework dispatch at decisioning/handler.py walls HttpSig off to resolve_by_agent_url so the bearer-resolution path is never invoked with an HttpSig credential at runtime — no signature-verification bypass. security-reviewer: no High, no Brian-blocker.
  • adopter-type-ignore-contract hook + CI step. Two checks look duplicate but aren't: mypy --strict enforces correctness, the script enforces no suppressions on top. Complementary.

Follow-ups (non-blocking — file as issues)

  • recovery="terminal" still in try_reserve_consumption for PROPOSAL_NOT_FOUND. decisioning/proposal_store.py:562-564 (in-memory) and decisioning/pg/proposal_store.py:691-693 (PG) still emit terminal for the same error code that's now correctable on the pre-check path. The enforce_proposal_expiry precheck at proposal_dispatch.py shadows these on the happy path, but a record evicted between expiry-check and lock-acquisition surfaces the stale terminal recovery. Bring both store sites in line with the new posture so the wire contract is uniform.
  • _credential_key doesn't handle HttpSigCredential. decisioning/registry_cache.py:152-163 dispatches on ApiKeyCredential / OAuthCredential and falls into unknown:HttpSigCredential for the third arm. Unreachable today — handler.py never calls resolve_by_credential with HttpSig — but the Protocol now publicly advertises the broader signature without a corresponding cache key. Add an explicit HttpSigCredential → f"http_sig:{keyid}:{agent_url}" branch so the next caller refactor can't quietly create a cache-collision bug.
  • PR title vs release-please. This SDK uses release-please with changelog-sections limited to feat|fix|perf|revert|deps. e63b384a fix(storyboards): align examples with 3.1 runner is the wire-visible recovery change — if this PR squash-merges under chore(dx), the fix gets buried with no version bump or changelog entry. Either preserve the per-commit history at merge or retitle to surface the fix: semantic.
  • pre-commit is only in [dependency-groups].dev (PEP 735), not [project.optional-dependencies].dev. A contributor following the legacy pip install -e ".[dev]" path (still a documented target in the Makefile as install-dev) will not get pre-commit. Either move it into the project extras so both paths work, or change the install-dev help string to flag that pre-commit is uv-only.

Minor nits (non-blocking)

  1. webhooks.py:1074-1089: boundary fail-open softening. validate_webhook_destination_url previously mapped non-str/non-AnyUrl inputs to url_text = Nonereason="missing_url". Now they hit str(...) and fall through to scheme/SSRF checks, which still reject — but the error category becomes invalid_scheme instead of the more accurate missing_url. Type-boundary fail-closed beats trusting downstream guards here.
  2. scripts/check_type_ignore_contract.py: startswith("type: ignore") over-matches. Catches a stray type: ignored typo as if it were a real ignore. Tighten to comment == "type: ignore" or comment.startswith("type: ignore[").
  3. scripts/check_type_ignore_contract.py: unhandled tokenize.TokenizeError. A future syntax error in tests/type_checks/ crashes the CI step instead of pointing at the bad file.
  4. pyproject.toml:126: mypy==1.20.2 pin is unexplained. Neighboring comment at :317-323 explains why mypy is duplicated across [project.optional-dependencies] and [dependency-groups] but not why this exact patch. One-line rationale would help the next reader who tries to bump it.
  5. CONTRIBUTING.md: make bootstrap advertises one step but does two. The commit-msg hook install is invisible to a contributor debugging a missing hook. One trailing parenthetical — "(installs pre-commit + commit-msg hooks)" — fixes it.
  6. tests/test_proposal_lifecycle_e2e.py: per-test from adcp.types import .... Three new tests do the import inside the function body; move to the module-level import block for consistency with the rest of the file.

Three commits, three conventional-commit flavors, one PR title — worth a passing thought before merge.

…ping

# Conflicts:
#	examples/seller_agent.py
#	src/adcp/decisioning/proposal_lifecycle.py
#	tests/test_proposal_lifecycle_e2e.py
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two real things to address before approve. The DX/bootstrap work is clean and the new refine guard's cross-tenant scoping is right.

Things I checked

  • make bootstrap + make typecheck-all + scripts/check_type_ignore_contract.py lock down the adopter type-check fixture contract. The script walks tests/type_checks/ with tokenize and fails on any # type: ignore. Right shape, and pre-commit + CI both wire it.
  • maybe_validate_refine_proposal_refs at src/adcp/decisioning/proposal_dispatch.py:340-388 threads expected_account_id=ctx.account.id into store.get. Cross-tenant probes collapse to PROPOSAL_NOT_FOUND per the principal-enumeration contract at proposal_lifecycle.py:68-74. Verified in both InMemoryProposalStore.get and PgProposalStore.get — single dict-lookup / single SELECT, no second-hop timing side-channel. security-reviewer: not a ship-blocker.
  • Mypy guard removals: capabilities.py:218-222 (_caps.adcp non-Optional in schema), server/responses.py:85 (preceding branch raises), signing/brand_jwks.py:436 (_inner non-None at :423), validation/oneof_hints.py (callers filter to dicts upstream) — all correctly dead branches.
  • Tuple-of-types refactor in protocols/mcp.py, server/a2a_server.py:50-53, server/serve.py:2371-2375, server/translate.py:119-122 is semantically equivalent. isinstance(exc, ()) returns False; cast(Any, exc) in translate.py:144 is needed because mypy can't narrow through a variable-typed tuple.
  • Pinning mypy==1.20.2 in both [dev] extras and [tool.uv] dev is the right call — anything looser reproduces the CI/local error-count split the comment at pyproject.toml:321 was already worried about.

Issues

  1. _part_data_dict regression in src/adcp/server/a2a_server.py:139-142 (and mirrored in src/adcp/protocols/a2a.py:39-44). part.data is google.protobuf.Value, not Struct — see _make_data_part:166-170 which builds it via ParseDict(data, Value()). MessageToDict on a Value returns whatever JSON kind it holds: None, bool, float, str, list, or dict. The pre-PR isinstance(value, dict) guard caught non-struct kinds and returned None so the parser fell through to TextPart parsing. Post-PR, a peer sending Part(data=Value(string_value=...)) returns a str from a function annotated -> dict[str, Any] | None. Caller at a2a_server.py:424 does data.get("skill") and raises AttributeError. _parse_request is called at :269, before the try at :289, so the exception escapes the structured adcp_error envelope. Happy path is unchanged; adversarial path is now a 500. Restore the guard, or change the signature to Any and add isinstance at the callsites. code-reviewer flagged this as a Blocker — downgrading because it's not a happy-path crash, but it should still be fixed before ship.

  2. Neither new test actually exercises maybe_validate_refine_proposal_refs. tests/test_proposal_lifecycle_e2e.py:280-305 uses action: \"finalize\" — the new guard explicitly skips finalize (proposal_dispatch.py:362-364). The error comes from the pre-existing maybe_intercept_finalize. tests/test_proposal_lifecycle_e2e.py:374-400 calls create_media_buy, but the new guard is wired into get_products only (handler.py:1804). The error comes from proposal_lifecycle.enforce_proposal_expiry. The pre-existing test_refine_unknown_proposal_is_correctable_not_found:198 does land on the new code path, but the PR doesn't make that explicit. Add a direct test with action=\"include\" or action=\"omit\" and an unknown proposal_id against get_products, asserting the manager's refine method was not called.

Follow-ups (non-blocking — file as issues)

  • BuyerAgentRegistry.resolve_by_credential widened from ApiKeyCredential | OAuthCredential to Credential across registry.py:219-220, registry_cache.py:334/532/634, pg/buyer_agent_registry.py:266. BuyerAgentRegistry is a Protocol surface adopters implement. PEP 544 contravariance accepts broader impls, but adopter registries that kept the narrower annotation will now fail Liskov against the published Protocol under strict mypy. Pre-1.0 bump-minor-pre-major: true in release-please-config.json:7 absorbs the semver impact, but a CHANGELOG line or BREAKING CHANGE: footer would give adopters the breadcrumb. ad-tech-protocol-expert: sole real adopter-breaking change in the diff.
  • refine[] at get_products_request.py:172-178 has min_length=1 and no max_length. The new guard does one sequential await store.get(...) per proposal-scoped entry. Per _size_limit.py:44 (10 MB cap), an attacker can pack ~170k entries per request → ~170k sequential DB roundtrips. Fix belongs at the schema layer, not in this PR.
  • webhook_supervisor.py:324 and webhook_supervisor_pg.py:213 widened sender: WebhookSenderWebhookSender | None, but both constructors raise ValueError on None at the next few lines. Type lies in the other direction — adopters reading the signature will think None is legal. Drop the runtime check or keep the narrower type.
  • BuyerAgent.status: BuyerAgentStatus | str at registry.py:158 is a runtime no-op (BuyerAgentStatus is a Literal[...], runtime is str) but worth a one-line CHANGELOG note for adopters that do exhaustive narrowing.

Minor nits (non-blocking)

  1. handler.py:1336. _resolve_account(ref: AccountReference | None, ...)ref: object | None. Internal method, low impact, but AccountReference | None was more honest — if mypy was the friction, a cast at the offending callsite would have kept the public type lie out of the signature.
  2. Commit-set-vs-diff mismatch. Title is chore(dx): align local typecheck workflow. The diff also adds a new decisioning guard (maybe_validate_refine_proposal_refs), widens a Protocol method, and adds a correctable PROPOSAL_NOT_FOUND test surface. A separate fix(decisioning): commit would have landed those in the right CHANGELOG bucket instead of getting swept under chore(dx). Notable, given the type-pin in this same PR comments specifically about CI/local drift.

LGTM after (1) is fixed and (2) is either re-tested or the existing coverage is called out explicitly. Once the parsing path is robust again and the new guard has a test that actually targets it, this is safe to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant